Divergence estimation for multidimensional densities via k-nearest-neighbor distances
Authors
Abstract
A new universal estimator of divergence is presented for multidimensional continuous densities, based on k-nearest-neighbor (k-NN) distances. Assuming independent and identically distributed (i.i.d.) samples, the new estimator is proved to be asymptotically unbiased and mean-square consistent. In experiments with high-dimensional data, the k-NN approach generally exhibits faster convergence than previous algorithms. It is also shown that the speed of convergence of the k-NN method can be further improved by an adaptive choice of k.
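The kind of k-NN divergence estimator described in the abstract can be sketched as follows. This is a minimal brute-force illustration in plain NumPy, not the paper's reference implementation; the function name and the exact bias-correction constant follow the commonly cited k-NN Kullback-Leibler estimator and are our own assumptions here:

```python
import numpy as np

def knn_divergence(x, y, k=1):
    """Sketch of a k-NN estimator of D(p || q) from samples
    x ~ p and y ~ q, each an array of shape (n_samples, d)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    y = np.atleast_2d(np.asarray(y, dtype=float))
    n, d = x.shape
    m = y.shape[0]
    # Brute-force pairwise Euclidean distances (fine for small n, m;
    # a k-d tree would be used in practice).
    dxx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    dxy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    np.fill_diagonal(dxx, np.inf)          # exclude each point itself
    rho = np.sort(dxx, axis=1)[:, k - 1]   # k-NN distance within x
    nu = np.sort(dxy, axis=1)[:, k - 1]    # k-NN distance from x to y
    # Divergence estimate from the ratio of k-NN distances,
    # plus a sample-size correction term.
    return (d / n) * np.sum(np.log(nu / rho)) + np.log(m / (n - 1))
```

For two samples from the same distribution the estimate should be close to zero, and shifting one sample away should increase it; larger k trades variance for bias, which is the trade-off the adaptive choice of k in the paper targets.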
Related articles
Bias Reduction and Metric Learning for Nearest-Neighbor Estimation of Kullback-Leibler Divergence
Asymptotically unbiased nearest-neighbor estimators for KL divergence have recently been proposed and demonstrated in a number of applications. With small sample sizes, however, these nonparametric methods typically suffer from high estimation bias due to the non-local statistics of empirical nearest-neighbor information. In this paper, we show that this non-local bias can be mitigated by chang...
Fast Parallel Estimation of High Dimensional Information Theoretical Quantities with Low Dimensional Random Projection Ensembles
• Goal: estimation of high dimensional information theoretical quantities (entropy, mutual information, divergence).
• Problem: computation/estimation is quite slow.
• Consistent estimation is possible by nearest neighbor (NN) methods [1] → pairwise distances of sample points:
– expensive in high dimensions [2],
– approximate isometric embedding into low dimension is possible (Johnson-Lindenstrau...
On the Estimation of alpha-Divergences
We propose new nonparametric, consistent Rényi-α and Tsallis-α divergence estimators for continuous distributions. Given two independent and identically distributed samples, a “naïve” approach would be to simply estimate the underlying densities and plug the estimated densities into the corresponding formulas. Our proposed estimators, in contrast, avoid density estimation completely, estimating...
Software Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms
A successful software project should be completed within a predetermined cost and time. Software is a product whose main cost lies in its expert workforce and professionals, and the most important factor in software cost estimation (SCE) is the trained workforce. The creative and abstract nature of software projects makes their cost and time extremely difficult ...
k-Nearest Neighbor Based Consistent Entropy Estimation for Hyperspherical Distributions
A consistent entropy estimator for hyperspherical data is proposed based on the k-nearest neighbor (knn) approach. The asymptotic unbiasedness and consistency of the estimator are proved. Moreover, cross entropy and Kullback-Leibler (KL) divergence estimators are also discussed. Simulation studies are conducted to assess the performance of the estimators for models including uniform and von Mis...
Journal: IEEE Trans. Information Theory
Volume: 55, Issue: -
Pages: -
Publication date: 2009